An Approach to Reducing Annotation Costs for BioNLP

Authors

  • Michael Bloodgood
  • K. Vijay-Shanker
Abstract

Active learning (AL) can significantly reduce annotation costs for a broad range of BioNLP tasks, and a specific AL algorithm we have developed is particularly effective at reducing annotation costs for these tasks. We previously developed an AL algorithm called ClosestInitPA that works best on tasks with the following characteristics: redundancy in the training material, burdensome annotation costs, good performance from Support Vector Machines (SVMs), and imbalanced datasets (i.e., when the task is set up as a binary classification problem, one class is substantially rarer than the other). Many BioNLP tasks have these characteristics, so our AL algorithm is a natural approach to apply to them.
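
The abstract does not spell out how ClosestInitPA chooses examples, so the following is only a minimal sketch of a generic margin-based selection step that is commonly used in active learning with linear SVMs: ask annotators to label the unlabeled examples whose decision value lies closest to the separating hyperplane. It should not be read as the authors' algorithm. The function names, the toy weights, and the toy pool are all hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence


def decision_value(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Signed score w.x + b of a linear classifier (for example, a linear SVM)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def select_closest_to_hyperplane(
    weights: Sequence[float],
    bias: float,
    pool: Sequence[Sequence[float]],
    batch_size: int,
) -> List[int]:
    """Return the indices of the batch_size pool examples with the smallest |w.x + b|."""
    ranked = sorted(
        range(len(pool)),
        key=lambda i: abs(decision_value(weights, bias, pool[i])),
    )
    return ranked[:batch_size]


# Hypothetical toy model and unlabeled pool (illustration only).
weights, bias = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [0.5, 0.25], [0.0, 2.0], [1.0, 1.5]]

# Indices of the two examples the current model is least certain about.
queries = select_closest_to_hyperplane(weights, bias, pool, batch_size=2)
print(queries)
```

In a full active learning loop, the selected examples would be sent to a human annotator, added to the training set, and the classifier retrained before the next selection round.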

Similar resources

A Pattern Approach for Biomedical Event Annotation

We describe our approach to GENIA Event Extraction in the Main Task of the BioNLP Shared Task 2011. There are two important parts in our method: Event Trigger Annotation and Event Extraction. We use rules and a dictionary to annotate event triggers. Event extraction is based on patterns created from dependency graphs. We use the UIMA framework to support all stages of our system.

TEES 2.1: Automated Annotation Scheme Learning in the BioNLP 2013 Shared Task

We participate in the BioNLP 2013 Shared Task with the Turku Event Extraction System (TEES) version 2.1. TEES is a support vector machine (SVM) based text mining system for the extraction of events and relations from natural language texts. In version 2.1 we introduce an automated annotation scheme learning system, which derives task-specific event rules and constraints from the training data, and ...

Annotation in Architecture: A Systematic Approach toward Mobilization and Development of Theoretical, Research, and Critical Basis in Architecture

Annotations usually refer to marginal notes that explain a difficult or ambiguous subject, or that provide a general definition or a critical remark for a particular part of a text. Historically, annotating was a well-known tradition in Islamic sciences and was used especially in times when there were fewer opportunities for generating new knowledge. The main question of this research is, can the tradi...

Biomedical Event Annotation with CRFs and Precision Grammars

This work describes a system for the tasks of identifying events in biomedical text and marking those that are speculative or negated. The architecture of the system relies on both Machine Learning (ML) approaches and hand-coded precision grammars. We submitted the output of our approach to the event extraction shared task at BioNLP 2009, where our methods suffered from low recall, although we ...

Active Learning for Coreference Resolution

Active learning can lower the cost of annotation for some natural language processing tasks by using a classifier to select informative instances to send to human annotators. It has worked well in cases where the training instances are selected one at a time and require minimal context for annotation. However, coreference annotations often require some context and the traditional active learnin...

Publication date: 2008